Letcher, Ned, Rebecca Dridan and Timothy Baldwin (2015) gDelta: A Missing Link in the Grammar Engineering Toolchain, Language Resources and Evaluation

Authors

Ned Letcher

Rebecca Dridan

Timothy Baldwin

Abstract

The development of precision grammars is an inherently resource-intensive process; their complexity means that changes made to one area of a grammar often introduce unexpected flow-on effects elsewhere in the grammar which may only be discovered after some time has been invested in updating numerous test suite items. In this paper, we present the browser-based gDelta tool, which aims to provide grammar engineers with more immediate feedback on the impact of changes made to a grammar by comparing parser output from two different grammar versions. We describe an attribute weighting algorithm for highlighting components of the grammar that have been strongly impacted by a modification to the grammar, as well as a technique for clustering test suite items whose parsability has changed, in order to locate related groups of effects. These two techniques are used to present the grammar engineer with different views on the grammar to inform them of different aspects of change in a data-driven manner.

Similar resources

Exploring Methods and Resources for Discriminating Similar Languages

The Discriminating between Similar Languages (DSL) shared task at VarDial challenged participants to build an automatic language identification system to discriminate between 13 languages in 6 groups of highly-similar languages (or national varieties of the same language). In this paper, we describe the submissions made by team UniMelb-NLP, which took part in both the closed and open categories...

Unsupervised Parse Selection for HPSG

Parser disambiguation with precision grammars generally takes place via statistical ranking of the parse yield of the grammar using a supervised parse selection model. In the standard process, the parse selection model is trained over a hand-disambiguated treebank, meaning that without a significant investment of effort to produce the treebank, parse selection is not possible. Furthermore, as t...

From Database to Treebank: On Enhancing Hypertext Grammars with Grammar Engineering and Treebank Search

This paper describes how electronic grammars can be further enhanced by adding machine-readable grammars and treebanks. We explore the potential benefits of implemented grammars and treebanks for descriptive linguistics, following the discursive methodology of Bird & Simons (2003) and the values and maxims identified by Nordhoff (2008). We describe the resources which we believe make implement...

Treeblazing: Using External Treebanks to Filter Parse Forests for Parse Selection and Treebanking

We describe “treeblazing”, a method of using annotations from the GENIA treebank to constrain a parse forest from an HPSG parser. Combining this with self-training, we show significant dependency score improvements in a task of adaptation to the biomedical domain, reducing error rate by 9% compared to out-of-domain gold data and 6% compared to self-training. We also demonstrate improvements in ...

Evaluating and Extending the Coverage of HPSG Grammars: A Case Study for German

In this work, we examine and attempt to extend the coverage of a German HPSG grammar. We use the grammar to parse a corpus of newspaper text and evaluate the proportion of sentences which have a correct attested parse, and analyse the cause of errors in terms of lexical or constructional gaps which prevent parsing. Then, using a maximum entropy model, we evaluate prediction of lexical types in ...

Publication date: 2015
